
RFC: Enhance compatibility with different schema versions #244

Open · wants to merge 10 commits into base: main

Conversation

@amorenoz (Collaborator) commented Oct 13, 2021

Currently, several limitations force libovsdb to be very strict about the db schema version it uses. These limitations mainly come from the fact that:

  • mapper.Info structs are created all over the codebase solely based on the schema (ignoring the provided model)
  • Although there are many internal structs where we keep a reference to a model.DatabaseModel and the ovsdb.Schema, we lack a struct that centralizes both and can make decisions based on their compatibility. This poses an additional challenge, because the schema and the model are obtained at two different points in time.

This PR performs two major code refactorings, on top of which a schema compatibility mode is implemented.

Refactor 1: Rename current DatabaseModel to DatabaseModelRequest and introduce an internal DatabaseModel type

All over the codebase we keep references to both the ovsdb.Schema and the model.DatabaseModel. In fact, server.go went one step further and defined:

```go
type DatabaseModel struct {
	Model  *model.DBModel
	Schema *ovsdb.DatabaseSchema
}
```

The first code refactoring proposed is to rename the client-provided DatabaseModel to DatabaseModelRequest. The naming could very well be discussed; my rationale is that the client is "requesting" that model, and such a request might not be 100% fulfilled depending on the schema. After the rename, a DatabaseModel is introduced that combines both objects that are core to the library and supports 2-step initialization, allowing the schema to be provided afterwards.
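To make the shape concrete, here is a minimal sketch of the two types and the 2-step initialization; everything except the DatabaseModelRequest/DatabaseModel names is an illustrative assumption, not the PR's actual code:

```go
package model

import (
	"fmt"
	"reflect"

	"github.com/ovn-org/libovsdb/ovsdb"
)

// DatabaseModelRequest is what the client asks the database model to
// look like (previously called DatabaseModel).
type DatabaseModelRequest struct {
	name  string
	types map[string]reflect.Type // table name -> model struct type
}

// DatabaseModel combines the requested model with the server schema.
// The schema is usually not known when the struct is created.
type DatabaseModel struct {
	request *DatabaseModelRequest
	schema  *ovsdb.DatabaseSchema
}

// NewDatabaseModel is step one: build the model from the request alone.
func NewDatabaseModel(request *DatabaseModelRequest) *DatabaseModel {
	return &DatabaseModel{request: request}
}

// SetSchema is step two: attach the schema once it is retrieved
// (e.g. when the client connects), after checking compatibility.
func (db *DatabaseModel) SetSchema(schema *ovsdb.DatabaseSchema) error {
	if schema.Name != db.request.name {
		return fmt.Errorf("schema %q does not match model %q", schema.Name, db.request.name)
	}
	db.schema = schema
	return nil
}
```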

Refactor 2: mapper.Mapper to use mapper.Info

All over the codebase we create mapper.Info structs and then call mapper.Mapper functions, which create mapper.Info structs again. Unify this into a more compact API by having mapper.Mapper functions accept a mapper.Info reference.
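As a usage sketch, with signatures approximated from this description and Bridge as a placeholder model: the caller builds the Info once and every Mapper call takes it by reference.

```go
// Build the Info once from the table name, its schema and the model,
// then hand it to the Mapper instead of re-deriving it on every call.
func newBridgeRow(m mapper.Mapper, schema ovsdb.DatabaseSchema, bridge *Bridge) (ovsdb.Row, error) {
	info, err := mapper.NewInfo("Bridge", schema.Table("Bridge"), bridge)
	if err != nil {
		return nil, err
	}
	// NewRow no longer needs a table name or a raw model: everything
	// it needs is carried by the Info reference.
	return m.NewRow(info)
}
```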

Functional enhancements

With these two refactorings in place, this PR optimizes the mapper.Info creation process by having the newly introduced DatabaseModel do it. Since this struct now has both the schema and the model, it has more control over what fields will actually be readable/writable. Besides, we can cache the Info's metadata and speed up creation (I don't expect a huge performance improvement in unit tests since they use small models).
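Extending the earlier sketch, the caching could look roughly like this (the Metadata/Info field shapes are assumptions for illustration):

```go
// The DatabaseModel grows a metadata cache, filled once when the
// schema is set, so building an Info later is just a map lookup
// instead of repeated reflection over the schema.
type DatabaseModel struct {
	request  *DatabaseModelRequest
	schema   *ovsdb.DatabaseSchema
	metadata map[string]mapper.Metadata // keyed by table name
}

// NewModelInfo wraps obj with the cached metadata for its table.
func (db *DatabaseModel) NewModelInfo(table string, obj interface{}) (*mapper.Info, error) {
	meta, ok := db.metadata[table]
	if !ok {
		return nil, fmt.Errorf("no metadata cached for table %s", table)
	}
	return &mapper.Info{Obj: obj, Metadata: meta}, nil
}
```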

Finally, this PR introduces a Compatibility mode by adding a flag to the DatabaseModelRequest. That way, the client can request the Model to be more flexible with regard to the supported schema. In this mode, columns with conflicting types or columns not present in the schema are omitted, as are missing tables.
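On the client side, enabling the mode would look roughly like this (NewDatabaseModelRequest follows the rename proposed above, and SetCompatibility is an assumed name for the new flag's setter):

```go
// Build the client-side model request as usual...
request, err := model.NewDatabaseModelRequest("Open_vSwitch", map[string]model.Model{
	"Bridge": &Bridge{},
})
if err != nil {
	log.Fatal(err)
}
// ...and opt in: with the flag set, validation skips mismatching or
// missing columns and tables instead of failing at Connect().
request.SetCompatibility(true)
```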

Fixes: #235

MonitorCookies are marshalled as a JSON object, not an array. Fix it so that the benchmark test works.

Signed-off-by: Adrian Moreno <[email protected]>
Clearly, DBModel does not hold the full database model. Instead, only the combination of model.DBModel, mapper.Mapper and ovsdb.Schema is a useful database model.

The fact that server.go had to define a struct called DatabaseModel with model.DBModel and ovsdb.Schema and dynamically create Mapper objects from them is proof of this.

In order to prepare for a DBModel refactoring, rename it to
DatabaseModelRequest as it's what the client requests the DatabaseModel
to look like.

This patch does not contain functional changes

Signed-off-by: Adrian Moreno <[email protected]>
Replace the one that server.go had to define. For now, it's just a
drop-in replacement of the previous type

Signed-off-by: Adrian Moreno <[email protected]>
It is common to first create a DatabaseModel based only on the DatabaseModelRequest, and then add or remove the schema later, e.g. when the client (re)connects.

Signed-off-by: Adrian Moreno <[email protected]>
For the cache, it's simply replacing three fields with one.
For the client, use the 2-step DatabaseModel initialization.

Signed-off-by: Adrian Moreno <[email protected]>
Now that the client, cache and server use the DatabaseModel as the central point of model creation and introspection, we can hide the DatabaseModelRequest and move its public functions to the DatabaseModel.

Signed-off-by: Adrian Moreno <[email protected]>
All around the codebase we're creating mapper.Info structures and then
calling Mapper functions that create Info structures again.

Given that mapper.Info already defines all the metadata that Mapper needs to do the native-to-ovs translations, it makes sense to use Info structures as input to all functions.

That simplifies the code inside the mapper module. Also, I'd expect some performance improvement, since we were creating multiple Info structs unnecessarily in the hot path.

It's true that, for now, this makes it slightly more cumbersome to call mapper functions, since the Info struct has to be created first and it now requires an additional argument (the table name). However, this can be improved later on by having the database model build the Info structs for us.

Signed-off-by: Adrian Moreno <[email protected]>
The core mapper API uses mapper.Info structs, which can be created just by inspecting a TableSchema.

However, with the DatabaseModel now centralizing access to the mapper API and containing both the Model types and the Schema, we can pre-create the mapper.Info.Metadata structs and cache them, so we create Info structs more efficiently.

Signed-off-by: Adrian Moreno <[email protected]>
Allow the user to set a Compatibility flag on the DatabaseModelRequest.

When that flag is set, the verification phase will not fail if a column is missing from the schema or has a different type. Instead, it will just skip the column.

The same goes for missing tables: they will just be skipped.

Signed-off-by: Adrian Moreno <[email protected]>
amorenoz added a commit to amorenoz/ovn-kubernetes that referenced this pull request Oct 13, 2021
It enables compatibility mode to cope with schema changes. This has to be revisited once/if the relevant PR gets merged:

ovn-org/libovsdb#244

Signed-off-by: Adrian Moreno <[email protected]>
@dave-tucker (Collaborator)

Thanks @amorenoz

> Refactor 1: Rename current DatabaseModel to DatabaseModelRequest and introduce an internal DatabaseModel type
>
> The first code refactoring proposed is to rename the client-provided DatabaseModel to DatabaseModelRequest. The naming could very well be discussed; my rationale is that the client is "requesting" that model, and such a request might not be 100% fulfilled depending on the schema. After the rename, a DatabaseModel is introduced that combines both objects that are core to the library and supports 2-step initialization, allowing the schema to be provided afterwards.

So firstly, combining the Model/Schema into a single struct makes a lot of sense. I'm struggling with the 2-step initialization though.

Currently we ensure that what we're connecting to maps 1:1 with what the server offers. This is important because the datatypes that comprise a DatabaseModel are used directly. Put differently, I can always ovs.Create(&Bridge{}) if I've created a model for Bridge. While Connect() could fail at runtime, once I'm connected there are no surprises.

If we move to a 2-step approach, where what's passed to NewOVSDBClient is now a DatabaseModelRequest, the API now acts differently. ovs.Connect() will succeed, but ovs.Create(&Bridge{}):

  • Can succeed
  • Can fail at runtime if the server doesn't support that table.
  • It can partially succeed but may omit columns that I've mapped because either the server doesn't support the column OR the server does support the column but the types are different

My issue with this is that it weakens the guarantees we had before 😢 and also leads to some rather confusing runtime behaviours. I was also very fond of the fact that the DatabaseModel abstraction eliminated the need to know about table names, columns, etc. You just work with structs. In order to ease the rough edges of dealing with different runtime behaviours, we will have to expose HasColumn, HasTable, IsSameTypeAsSchema APIs for users to conditionally build version-support logic into their applications, which will end up very messy in the long term. If we head down this road, I feel we'd be better off not using Model at all and reverting to the old behaviour of using a map[string]interface{}.
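To make the concern concrete, applications would end up with conditional probes like this hypothetical snippet (HasTable/HasColumn are the illustrative names from the paragraph above, not an existing API, and Bridge is a placeholder model):

```go
// Hypothetical version-probing logic pushed onto every application:
func createBridge(ovs client.Client, name string) error {
	if !ovs.HasTable("Bridge") { // hypothetical API
		return fmt.Errorf("server schema has no Bridge table")
	}
	bridge := Bridge{Name: name}
	if ovs.HasColumn("Bridge", "datapath_version") { // hypothetical API
		bridge.DatapathVersion = "2.15.0"
	}
	_, err := ovs.Create(&bridge)
	return err
}
```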

> Finally, this PR introduces a Compatibility mode by adding a flag to the DatabaseModelRequest. That way, the client can request the Model to be more flexible with regard to the supported schema. In this mode, columns with conflicting types or columns not present in the schema are omitted, as are missing tables.

The concern I have here is that if you enable the bool you have to be prepared to deal with the cases explained above.
I worry that flipping a bool is "too easy" and can lead to unexpected runtime behaviour as mentioned earlier.

Ultimately my position is:

  1. Upgrade ovsdb-server before you upgrade your clients
  2. Use modelgen and application versioning to ensure that your app (using libovsdb) is compatible with your target server version

If we really, really need a version of libovsdb that can speak different versions of the same schema, then I think the correct approach is to make it easy to register/maintain a model per version, but I've not had the time to design how this might work.

@amorenoz (Collaborator, Author)

> Thanks @amorenoz
>
> > Refactor 1: Rename current DatabaseModel to DatabaseModelRequest and introduce an internal DatabaseModel type
> >
> > The first code refactoring proposed is to rename the client-provided DatabaseModel to DatabaseModelRequest. The naming could very well be discussed; my rationale is that the client is "requesting" that model, and such a request might not be 100% fulfilled depending on the schema. After the rename, a DatabaseModel is introduced that combines both objects that are core to the library and supports 2-step initialization, allowing the schema to be provided afterwards.
>
> So firstly, combining the Model/Schema into a single struct makes a lot of sense. I'm struggling with the 2-step initialization though.

Well, 2-step initialization is already happening, since the schema validation happens at Connect().

> Currently we ensure that what we're connecting to maps 1:1 with what the server offers. This is important because the datatypes that comprise a DatabaseModel are used directly. Put differently, I can always ovs.Create(&Bridge{}) if I've created a model for Bridge. While Connect() could fail at runtime, once I'm connected there are no surprises.

The 2-step initialization of the DatabaseModel is perfectly compatible with the current validation criteria. It's just a way to add the schema to the struct after it has been created. We can make Connect() fail as it does today (in fact, without the last patch, that is exactly what this PR does). It's actually even more strict, since it centralizes the creation of Info structures, avoiding crashes on cache updates (e.g. running cmd/stress against examples/ovsdb-server crashes with panic: FieldByColumn: column dpdk_initialized not found in orm info).

> If we move to a 2-step approach, where what's passed to NewOVSDBClient is now a DatabaseModelRequest, the API now acts differently. ovs.Connect() will succeed, but ovs.Create(&Bridge{}):
>
> * Can succeed
> * Can fail at runtime if the server doesn't support that table.
> * It can partially succeed but may omit columns that I've mapped because either the server doesn't support the column OR the server does support the column but the types are different
>
> My issue with this is that it weakens the guarantees we had before 😢 and also leads to some rather confusing runtime behaviours. I was also very fond of the fact that the DatabaseModel abstraction eliminated the need to know about table names, columns, etc. You just work with structs. In order to ease the rough edges of dealing with different runtime behaviours, we will have to expose HasColumn, HasTable, IsSameTypeAsSchema APIs for users to conditionally build version-support logic into their applications, which will end up very messy in the long term. If we head down this road, I feel we'd be better off not using Model at all and reverting to the old behaviour of using a map[string]interface{}.

I agree, it weakens a guarantee that was very useful. However, if the application cannot guarantee it will be updated before the server, it will have to code around the possibility of different schema versions anyhow (even if it's by using another &Bridge type). But I guess it's still more robust than having to query each column to know whether it's really empty or just not supported...

> > Finally, this PR introduces a Compatibility mode by adding a flag to the DatabaseModelRequest. That way, the client can request the Model to be more flexible with regard to the supported schema. In this mode, columns with conflicting types or columns not present in the schema are omitted, as are missing tables.
>
> The concern I have here is that if you enable the bool you have to be prepared to deal with the cases explained above. I worry that flipping a bool is "too easy" and can lead to unexpected runtime behaviour as mentioned earlier.

I agree. The flag was just to enable the OVN feature testing.

> Ultimately my position is:
>
> 1. Upgrade ovsdb-server before you upgrade your clients
> 2. Use `modelgen` and application versioning to ensure that your app (using libovsdb) is compatible with your target server version
>
> If we really, really need a version of libovsdb that can speak different versions of the same schema, then I think the correct approach is to make it easy to register/maintain a model per version, but I've not had the time to design how this might work.

Just throwing out another idea: maybe a minimum and a maximum DatabaseModel? The client tries the maximum; if it fails to validate, it tries the minimum; if that also fails, Connect() fails. The client then returns which model was selected.
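In code, that idea might look something like this (entirely hypothetical API, just to sketch the flow):

```go
// Hypothetical: register a maximum model plus a fallback; Connect()
// validates the maximum first and falls back to the minimum before
// giving up entirely.
ovs, err := client.NewOVSDBClient(maxModel,
	client.WithFallbackModel(minModel), // hypothetical option
)
if err != nil {
	log.Fatal(err)
}
if err := ovs.Connect(ctx); err != nil {
	log.Fatal(err) // neither model validated against the schema
}
// Hypothetical accessor reporting which model survived validation.
log.Printf("selected model: %v", ovs.SelectedModel())
```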

Anyhow, as I said above, the refactoring has nothing to do with the validation strictness, and I think it makes the code cleaner (and fixes a crash :)). So if you think it's worth it, let me know and I'll create another PR without the last commit.

@dave-tucker (Collaborator)

> So if you think it's worth it, let me know and I'll create another PR without the last commit.

@amorenoz I absolutely think it's worth it. Fewer bugs and crashes are awesome 😆

I'm still not a fan of DatabaseModelRequest but don't have a better suggestion, so let's run with it for now and we can figure it out later.

@amorenoz (Collaborator, Author)

> > So if you think it's worth it, let me know and I'll create another PR without the last commit.
>
> @amorenoz I absolutely think it's worth it. Fewer bugs and crashes are awesome 😆
>
> I'm still not a fan of DatabaseModelRequest but don't have a better suggestion, so let's run with it for now and we can figure it out later.

Me neither, I hate it actually. How about ClientDatabaseModel (and we internally build the DatabaseModel that contains the ClientDatabaseModel, the server schema and the mapper)?

@dave-tucker (Collaborator)

@amorenoz yep I prefer ClientDatabaseModel

Successfully merging this pull request may close these issues.

model: Connecting to DBs that don't contain tables that are in the model should not be an error.